Abstract. Given strings $P$ of length $m$ and $T$ of length $n$ over an
alphabet of size $\sigma$, the string matching with $k$-mismatches problem is
to find the positions of all the substrings in $T$ that are at Hamming
distance at most $k$ from $P$. If $T$ can be read only one character at a
time, the best known bounds are $O(n\sqrt{k \log k})$ and
$O(n + n\sqrt{k/w}\,\log k)$ in the word-RAM model with word length $w$. In
the RAM models (including $AC^0$ and word-RAM) it is possible to read up to
$\Theta(w/\log\sigma)$ characters in constant time if the characters of $T$
are encoded using $\lceil \log\sigma \rceil$ bits. The only solution for
$k$-mismatches in packed text works in
$O((n \log\sigma/\log n) \lceil m \log(k + \log n/\log\sigma)/w \rceil + n^{\varepsilon})$
time, for any $\varepsilon > 0$. We present an algorithm that runs in time
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}(1 + \log\min(k,\sigma)\log m/\log\sigma))$
in the $AC^0$ model if $m = O(w/\log\sigma)$ and $T$ is given packed. We also
describe a simpler variant that runs in time
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}\log\min(m, \log w/\log\sigma))$
in the word-RAM model. The algorithms improve the existing bound for
$w \gg \log n$. Based on the introduced technique, we present algorithms for
several other approximate matching problems.

1 Introduction 

The string matching problem consists in reporting all the occurrences of a
pattern $P$ of length $m$ in a text $T$ of length $n$, both strings over a
common alphabet. The occurrences may be exact or approximate according to a
specified matching model. For most matching problems, all the characters from
the text and the pattern need to be read at least once in the worst case;
hence, if they are read one at a time, the worst-case lower bound is
$\Omega(n)$. Interestingly, for some standard problems (e.g., exact pattern
matching) it is possible to achieve a sublinear search time, for short
patterns, even in the worst case, if the word-RAM computational model is
assumed and the text is packed. In a packed encoding, the characters of a
string are stored adjacently in memory and each character is encoded using
$\log\sigma$ bits (footnote 4), where $\sigma$ is the alphabet size. A single
machine word, of size $w \ge \log n$ bits, thus contains up to
$\alpha = \lfloor w/\log\sigma \rfloor$ characters. For most of the following
considerations we assume a short pattern, i.e., one which fits in a word
($m \le \alpha$).

4 Throughout the paper, all logarithms are in base 2. W.l.o.g. we also assume
that $\sigma$ is a power of two.

For this setting and the exact string matching problem, several sublinear-time 
algorithms have been given in recent years [12, 7, 5, 6, 8]. 

In this paper we study the string matching with $k$-mismatches problem in
the packed scenario. This problem is to find the positions of all the
substrings in $T$ that are at Hamming distance at most $k$ from $P$, i.e.,
that match $P$ with at most $k$ mismatches. For this problem, the best known
worst-case bounds are $O(n\sqrt{k \log k})$ for the algorithm by Amir et
al. [2] and $O(n + n\sqrt{k/w}\,\log k)$ for its implementation based on
word-level parallelism [13]. One classical result in the word-RAM model that
is also practical is the Shift-Add algorithm [4]. The best worst-case bound
of this algorithm, based on the Matryoshka counters technique [14], is
$O(n \lceil m/w \rceil)$.

In [12] Fredriksson presented a Shift-Add variant, based on the
super-alphabet technique, that works in
$O((n \log\sigma/\log n) \lceil m \log(k + \log n/\log\sigma)/w \rceil + n^{\varepsilon})$
time, for any $\varepsilon > 0$. To our knowledge, this is the only solution
for the $k$-mismatches problem that works on packed text and that achieves
sublinear time complexity when $m$ and $k$ are sufficiently small.

In this work, we present an algorithm for the $k$-mismatches problem that
runs in time
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}(1 + \log\min(k,\sigma)\log m/\log\sigma))$
in the $AC^0$ model for $m \le \alpha$ if $T$ is given packed. We also
describe a simpler variant that runs in time
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}\log\min(m, \log w/\log\sigma))$
in the word-RAM model. In particular, it achieves sublinear worst-case time
when $m \log\sigma \log\min(m, \log w/\log\sigma) = o(w)$. Note that for
$w = \Theta(\log n)$ Fredriksson's solution is better, but our algorithm
dominates if $w \gg \log n$, which has become a fairly standard assumption.



2 Basic notions and definitions 

Let $\Sigma = \{0, 1, \ldots, \sigma - 1\}$ denote an integer alphabet and
$\Sigma^m$ the set of all possible sequences of length $m$ over $\Sigma$.
$S[i]$, $i \ge 0$, denotes the $(i+1)$-th character of string $S$, and
$S[i \ldots j]$ its substring between the $(i+1)$-st and the $(j+1)$-st
characters (inclusive).

The $k$-mismatches problem consists in, given a pattern (string) $P$ of
length $m$ and a text (string) $T$ of length $n$, reporting all the positions
$0 \le j \le n - m$ such that $|\{0 \le h < m : T[j+h] \ne P[h]\}| \le k$,
i.e., such that the Hamming distance between $P$ and the substring
$T[j \ldots j+m-1]$ is at most $k$.
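
For reference, the definition above can be checked directly with a
character-by-character scan. The following Python sketch is only an
illustrative baseline running in $O(nm)$ time, not one of the packed
algorithms of this paper; the function name `k_mismatches_naive` is an
assumption introduced here.

```python
def k_mismatches_naive(P, T, k):
    """Report every position j such that P and T[j : j+m] differ in at most
    k positions (Hamming distance <= k). Runs in O(n*m) time."""
    m, n = len(P), len(T)
    positions = []
    for j in range(n - m + 1):
        # Count the indices h with T[j+h] != P[h].
        mismatches = sum(1 for h in range(m) if T[j + h] != P[h])
        if mismatches <= k:
            positions.append(j)
    return positions
```

Such a baseline is convenient as a correctness oracle when experimenting
with word-parallel variants.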

The word-RAM model is assumed, with machine word size $w = \Omega(\log n)$.
We use some bitwise operations following the standard notation of the C
language: &, |, ^, ~, <<, >> for and, or, xor, not, left shift and right
shift, respectively.



3 The algorithm 



We start the presentation with a simple idea, which is then extended and
modified in some ways. We define an $(f)$-word as a machine word logically
divided into $\lfloor w/f \rfloor$ fields of $f$ bits. The most significant
bit in a field is called the top bit. We also define, for a given field size
$f$, the (constant) word $V_f = 10^{f-1} \ldots 10^{f-1}$, i.e., the word in
which exactly the top bit of every field is set. Consider two
$(\log\sigma)$-words $A$ and $B$, each containing a packed string of length
$m$ in its $m \log\sigma$ least significant bits (i.e., each field of
$\log\sigma$ bits encodes a character). The higher bits in both words, if
any, are all 0s (where there can be no misunderstanding, we will silently
assume this convention about used and unused bits from now on). We perform
the xor operation of $A$ and $B$, and the number of non-zero fields in the
result is exactly the Hamming distance between the two strings. To make this
calculation efficient, we make use of two primitives on a machine word:

- find non-zero fields (fnf($A$, $f$)): given an $(f)$-word $A$, return a
word in which a bit is set to 1 if it is the top bit of a non-zero field in
$A$ and to 0 otherwise.

- sideways addition (sa($A$)): given a word $A$, return the number of bits
set in $A$.

How to implement the first primitive in $O(1)$ time was presented in
[8, Sect. 4], but we will give a simpler procedure. The second primitive is a
well-known bitwise operation, also known as popcount. The folklore method
(footnote 5) to compute it has $O(\log\log w)$ time complexity. The procedure
to compute the Hamming distance of $A$ and $B$ can thus be implemented with
the following operations:

1. A' <- A ^ B

2. A' <- fnf(A', log σ)

3. return sa(A')

Using this method, we can obtain an algorithm for the string matching with
$k$-mismatches problem that runs in
$O(n \lceil m \log\sigma/w \rceil \log\log w)$ time. Note that the resulting
algorithm is also practical and compares favorably with the classical
Shift-Add algorithm [4] for small alphabets and large $k$, although it is
less flexible (no support for classes of characters). It is also worth noting
that recent processors include a POPCNT instruction to compute the sideways
addition of a word, so the $\log\log w$ term disappears in practice.

We now explain how to implement the fnf primitive, in three simple steps
and in constant time. The input is a $(\log\sigma)$-word $A$.

1. W <- A & ~V_{log σ}

2. A' <- V_{log σ} - W

3. A' <- (~A' | A) & V_{log σ}
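
The three fnf steps, together with the Hamming-distance procedure given
earlier, can be sketched in Python as follows. This is a minimal illustration
under stated assumptions: Python integers stand in for $w$-bit machine words,
the helper names (`v_mask`, `pack`, `hamming_packed`) are introduced here, and
the first character is assumed to be stored in the least significant field.

```python
def v_mask(f, w):
    """V_f: the w-bit constant with the top bit of every f-bit field set."""
    v = 0
    for i in range(w // f):
        v |= 1 << (i * f + f - 1)
    return v

def fnf(a, f, w):
    """Return a word with the top bit set in every non-zero f-bit field of a."""
    v = v_mask(f, w)
    low = a & ~v          # step 1: drop the top bit of every field
    t = v - low           # step 2: top bit stays set only if the low bits are 0
    return (~t | a) & v   # step 3: low bits non-zero, or original top bit set

def pack(chars, bits):
    """Pack small integers into one word, first character in the lowest field."""
    word = 0
    for i, c in enumerate(chars):
        word |= c << (i * bits)
    return word

def hamming_packed(a, b, bits, w):
    """Hamming distance of two packed strings: xor, fnf, then popcount (sa)."""
    return bin(fnf(a ^ b, bits, w)).count("1")
```

The subtraction in step 2 never borrows across fields, because every field of
$V_f$ holds $2^{f-1}$, which is larger than any value left in the low $f-1$
bits of the corresponding field.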

5 http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel



We now define two variants of the sideways addition and another known 
primitive on words that we will use in a refined variant of our technique. 

interleaved blockwise sideways addition (ibsa($A$, $f$, $b$)): Given an
$(f)$-word $A$, with $f \le \log\sigma$ and such that only the top bit of
each field may be set, and a power of two $b$, return the word $A_{\log b}$,
where

$$A_j = \begin{cases} A_{j-1} + (A_{j-1} \ll (2^{j-1} \times \log\sigma)) & \text{if } j > 0 \\ A \gg (f-1) & \text{otherwise} \end{cases}$$

This operation computes the sums of all the sequences of $b$ $f$-bit fields
spaced by $\log\sigma$ bits and is a variant of the parallel prefix-sum
operation described in [15]. Each sum is stored into the last field of the
corresponding sequence. Since $f \ge \log(b+1)$ does not necessarily hold,
the top bits of all the fields are masked out before each addition and
restored afterwards. In this way, if a sum is $\ge 2^{f-1}$, its encoded
value is $\ge 2^{f-1}$ but the exact value is undetermined. This operation
can be implemented in time $O(\log b)$.

blockwise sideways addition (bsa($A$, $f$, $b$)): Given an $(f)$-word $A$,
such that only the top bit of each field may be set, and a power of two $b$,
return a word in which each block of $bf$ bits contains the number of bits
set in the corresponding $b$ fields in $A$.

This operation can be implemented in time $O(\log\min(b, \log w/f))$ using
the following method. We assume that the word size $w$ is a power of two. Let
$r$ be the smallest power of two greater than or equal to $\log(w+1)/f$. The
first step consists in computing a word logically divided into fields of
$\min(b, r)f$ bits, such that each field contains the number of bits set in
the corresponding $\min(b, r)$ fields in the original word. This widening
operation can be performed in $\log\min(b, r) = O(\log\min(b, \log w/f))$
steps using simple bitwise operations.

Since both $r$ and $b$ are powers of two, each block spans an integral number
of fields of $\min(b, r)f$ bits. Observe that there can be at most $w$ bits
set in a word, so $rf$ bits are enough to encode the total number of bits. If
$b \le r$, then, since $b$ is a power of two, after the last widening step we
have a word divided into fields of $bf$ bits, each one containing the desired
number of ones. Otherwise, if $b > r$, we use a multiplication with the mask
$0^{rf-1}1 \ldots 0^{rf-1}1$ to store into each field the sum of the previous
fields including itself. This corresponds to computing the prefix sum of the
sequence of numbers given by the fields. It is not hard to see that, after
the multiplication, the number of bits in a block is equal to the last field
of the block minus the last field of the previous block. This operation can
be implemented in parallel for all the blocks with a shift and a subtraction.

Observe that the claimed bound holds only if we assume that multiplication
takes $O(1)$ time, which is true in the word-RAM model but not in the $AC^0$
model. In the $AC^0$ model the bsa operation can be performed in $O(\log b)$
time by applying $O(\log b)$ widening steps.
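
The multiplication step of bsa (the case $b > r$) can be illustrated with the
following Python sketch. It assumes that the widening phase has already been
done, so the input holds small counts in fields of `g` bits (playing the role
of $rf$), that $w$ is a multiple of the block size, and that the total of all
fields fits in `g` bits. The function name and parameters are assumptions
made for this illustration, and the result is left in the last field of each
block.

```python
def block_counts(x, g, q, w):
    """x: w-bit word of g-bit fields holding small counts.
    Returns a word in which the last field of every block of q fields holds
    the sum of that block (all other fields are zero)."""
    mask_w = (1 << w) - 1
    # Mask 0^{g-1}1 ... 0^{g-1}1: a 1 in the lowest bit of every field.
    ones = sum(1 << (i * g) for i in range(w // g))
    # After the multiplication, field i holds the sum of fields 0..i.
    prefix = (x * ones) & mask_w
    # Keep only the last field of each block.
    last = sum(((1 << g) - 1) << ((t * q + q - 1) * g)
               for t in range(w // (q * g)))
    ends = prefix & last
    # Block sum = prefix at block end minus prefix at previous block end.
    return ends - ((ends << (q * g)) & mask_w)
```

For example, with 4-bit fields holding 1, 2, 3, 1 (lowest field first) and
blocks of two fields, the prefix sums are 1, 3, 6, 7 and the block sums are
3 and 4.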

parallel minima (maxima) [17] (pmin (pmax)): Given two $(b)$-words $A$ and
$B$, return a word in which a bit is set to 1 if it is the top bit of a field
in $A$ whose value is smaller (larger) than or equal to the value of the
corresponding field in $B$ and to 0 otherwise. pvmin (pvmax) is similar, but
instead of setting the top bits, it returns the actual minimum (maximum)
value pair-wise for each field.

These operations can be implemented in constant time, as demonstrated by the
following code (pmin):

1. T_A <- A & V_b

2. T_B <- B & V_b

3. A' <- A & ~V_b

4. A'' <- (B | V_b) - A'

5. H_1 <- ~A'' & ~T_A & T_B

6. H_2 <- A'' & ~T_A & ~T_B

7. H_3 <- A'' & ~T_A & T_B

8. H_4 <- A'' & T_A & T_B

9. A''' <- (H_1 | H_2 | H_3 | H_4) & V_b
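
The nine pmin steps translate almost literally into Python. The sketch below
is illustrative only: Python integers stand in for $w$-bit words, and the
helper `top_bits` (which builds $V_b$) and the explicit word-size parameter
`w` are assumptions introduced here.

```python
def top_bits(f, w):
    """V_f: the w-bit constant with the top bit of every f-bit field set."""
    return sum(1 << (i * f + f - 1) for i in range(w // f))

def pmin(a, b, f, w):
    """Set the top bit of each f-bit field where field(a) <= field(b)."""
    v = top_bits(f, w)
    ta = a & v                # step 1: top bits of a
    tb = b & v                # step 2: top bits of b
    a_low = a & ~v            # step 3: a without its top bits
    d = (b | v) - a_low       # step 4: top bit of d set iff low(a) <= low(b)
    h1 = ~d & ~ta & tb        # step 5: only b has the top bit, low(a) > low(b)
    h2 = d & ~ta & ~tb        # step 6: no top bits, low(a) <= low(b)
    h3 = d & ~ta & tb         # step 7: only b has the top bit, low(a) <= low(b)
    h4 = d & ta & tb          # step 8: both top bits, low(a) <= low(b)
    return (h1 | h2 | h3 | h4) & v   # step 9
```

Setting the top bit of every field of $B$ before the subtraction guarantees
that no borrow propagates from one field into the next.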

We now show how to apply the described ideas in an (improved) algorithm
for the $k$-mismatches problem on packed text for short patterns
($m \le \alpha$). Our method exploits a general technique [16] to increase
the parallelism in string matching algorithms based on word-level
parallelism. We first present a solution in the $AC^0$ model, and then
describe a simpler variant that is better in the word-RAM model. Let
$\hat{m}$ be the smallest power of two greater than or equal to $m$ and let
$\ell = \lfloor w/(\hat{m}\log\sigma) \rfloor$. We first preprocess the
pattern $P$ to create a word $A$ with $\ell$ copies of $P$, each of length
$\hat{m}\log\sigma$ bits, starting from the least significant bit. The last
$\hat{m} - m$ fields of each copy are set to zero. We also maintain a word
$H$, initialized to 0, that will be used during the searching phase. Let
$B_i$ be the word containing the packed encoding of the substring
$T[j \ldots j + \ell m - 1]$, where
$j = \ell \lfloor i/m \rfloor m + i \bmod m$, with $\hat{m} - m$ zero
(padding) fields every $m$ fields. Note that the word $B_i$, for $i > 0$, can
be computed incrementally from $B_{i-1}$ and the packed text in constant
time, using simple bitwise operations. Let $\hat{k}$ be the smallest power of
two greater than $k$. Our search algorithm performs the following main steps,
for each $0 \le i < n/\ell$:

1. A' <- A ^ B_i

2. A' <- fnf(A', log σ)

3. H <- (H << f) | A'

4. if i > 0 and i mod ⌊log σ / f⌋ = 0

5.     M <- pmin(ibsa(H, f, m̂), K, f)

6.     report(M)

7.     H <- 0

where $f = \min(\log\hat{k} + 1, \log\sigma)$ and $K$ is a word with
$\ell \lfloor \log\sigma/f \rfloor$ copies of the integer $k$ spaced by
$f(\hat{m} - 1)$ bits. At each iteration, our algorithm processes $\ell$
substrings of $T$ in parallel using the technique to compute the Hamming
distance of two words described before. However, we report the occurrences
only every $\lfloor \log\sigma/f \rfloor$ iterations, so as to reduce the
overhead due to counting the number of mismatches for each substring when
$\log k = o(\log\sigma)$. To this end, we compact the fields in the word
fnf($A$ ^ $B_i$, $\log\sigma$) into fields of size $f$ in the word $H$. If
$i > 0$ and $i \bmod \lfloor \log\sigma/f \rfloor = 0$, i.e., every
$\ell \lfloor \log\sigma/f \rfloor$ processed substrings, we report the
occurrences as follows. First, observe that the word $H$ contains
$\ell m \lfloor \log\sigma/f \rfloor$ fields of $f$ bits, encoding the
mismatches for the substrings of $T$ of length $m$ corresponding to the words
$B_{i-j}$, for $j = 0, \ldots, \lfloor \log\sigma/f \rfloor - 1$. More
precisely, the $l$-th sequence of fnf($A$ ^ $B_{i-j}$, $\log\sigma$) spans
the fields
$s, s + \lfloor \log\sigma/f \rfloor, \ldots, s + \lfloor \log\sigma/f \rfloor (m-1)$,
where $s = j + (l-1)\hat{m}\lfloor \log\sigma/f \rfloor$, for
$l = 1, \ldots, \ell$. Using the ibsa operation, we compute a word such that
the last field of each sequence has value equal to the number of bits set
(mismatches) in all the fields of the sequence if the number of mismatches is
less than $\hat{k}$, and a value $\ge \hat{k}$ otherwise. Then, to find all
the occurrences with at most $k$ mismatches, we use the pmin operation with
the word $K$ to identify the sequences with a bit count less than or equal to
$k$. Finally, to iterate over all the occurrences we use the well-known
bitwise operation that computes the position of the highest bit set in a
word. Observe that this operation is in $AC^0$ and takes constant time [3].
The time complexity of this algorithm is
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}(1 + \log\min(k,\sigma)\log m/\log\sigma))$.
It obtains the $O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor})$ bound,
corresponding to no overhead for the bitwise operations, if
$\log\min(k,\sigma)\log m = O(\log\sigma)$.

We now present the simpler variant in the word-RAM model. Let $r$ be the
smallest power of two greater than or equal to $\log(w+1)/\log\sigma$. The
algorithm performs the following main steps, for each $0 \le i < n/\ell$:

1. A' <- A ^ B_i

2. A' <- fnf(A', log σ)

3. M <- pmin(bsa(A', log σ, m̂), K, f)

4. report(M)

where in this case $f = \min(r, \hat{m})\log\sigma$ and $K$ contains $\ell$
copies of $k$ spaced by $\hat{m}\log\sigma - f$ bits. The main difference in
this algorithm is that we do not defer the reporting of occurrences. Rather,
at each iteration we use the bsa operation, which is cheap in this model, to
compute into fields of $f$ bits the number of bits set (mismatches) in each
block of $\hat{m}$ fields of $A'$. Observe that in this setting bsa has
$O(\log\min(m, \log w/\log\sigma))$ time complexity. Hence, our algorithm has
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}\log\min(m, \log w/\log\sigma))$
time complexity, and it obtains the
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor})$ bound for
$\log\sigma = \Omega(\log w)$ or constant $m$.
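
The alignment scheme used by both variants ($\ell$ padded pattern copies
matched against $\ell$ consecutive text chunks of $m$ characters) can be
sketched in Python as follows. This is a simplified illustration under loud
assumptions: Python integers stand in for machine words, the word $B_i$ is
rebuilt from scratch instead of incrementally, and a plain per-block popcount
loop stands in for the bsa, pmin and report steps. All function names are
introduced here for illustration.

```python
def pack(chars, bits):
    """Pack small integers into one word, first character in the lowest field."""
    word = 0
    for i, c in enumerate(chars):
        word |= c << (i * bits)
    return word

def fnf(a, f, fields):
    """Top bit of every non-zero f-bit field of a (three-step method)."""
    v = sum(1 << (i * f + f - 1) for i in range(fields))
    t = v - (a & ~v)
    return (~t | a) & v

def search_packed(P, T, k, bits, w):
    """Report all j with at most k mismatches; P and T hold ints < 2**bits."""
    m, n = len(P), len(T)
    mh = 1
    while mh < m:                 # m-hat: smallest power of two >= m
        mh *= 2
    blk = mh * bits               # bits occupied by one padded pattern copy
    ell = w // blk                # number of pattern copies per word
    assert ell >= 1, "the padded pattern must fit in a word"
    A = sum(pack(P, bits) << (c * blk) for c in range(ell))
    out = []
    for base in range(0, n, ell * m):
        for off in range(m):
            j = base + off
            # B_i: ell chunks of m text characters, each padded to mh fields.
            B = sum(pack(T[j + c * m: j + (c + 1) * m], bits) << (c * blk)
                    for c in range(ell))
            X = fnf(A ^ B, bits, ell * mh)
            for c in range(ell):  # stand-in for bsa + pmin + report
                start = j + c * m
                if start + m <= n:
                    block = (X >> (c * blk)) & ((1 << blk) - 1)
                    if bin(block).count("1") <= k:
                        out.append(start)
    return sorted(out)
```

Each group of $m$ iterations covers the $\ell m$ starting positions
$base, \ldots, base + \ell m - 1$ exactly once, which is where the factor
$\lfloor w/(m\log\sigma) \rfloor$ in the bounds comes from.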

Finally, we give a variant useful for two extreme cases: either $k$ or
$m - k$ is very small. More precisely, it is competitive when
$k = o(\log\min(k,\sigma)\log m/\log\sigma)$ or
$m - k = o(\log\min(k,\sigma)\log m/\log\sigma)$. It uses only $AC^0$
instructions. In this variant, each pattern copy in $A$ has
$p = 1 + m\log\sigma$ associated bits. The most significant (extra) bit is a
sentinel that will signal that there are more than $k$ mismatches, as will be
shown shortly. The remaining $m\log\sigma$ bits encode the pattern as usual.
The idea is to parallelize the well-known sideways addition implementation in
which the least significant bit set is cleared in a loop (footnote 6). To
this end, we perform the following procedure:



1. A' <- A ^ B_i

2. A' <- fnf(A', log σ)

3. for i <- 1 to k + 1

4.     A' <- A' | V_p

5.     A' <- A' & (A' - (V_p >> (p - 1)))

6. M <- (A' & V_p) ^ V_p

7. report(M)

After the last operation we obtain set bits only in the positions
corresponding to pattern copies with at most $k$ mismatches. Indeed, for a
copy with more than $k$ mismatches even the last subtraction does not clear
the sentinel bit, and the final xor with $V_p$ inverts the sentinels. The
complexity of the described operation is $O(k)$, so the time complexity of
this algorithm is $O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor} k)$. A twin
solution handles the case of small $m - k$; it basically consists in using
the same method on the bitwise complement of the top bits of $A'$.
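
The sentinel loop can be sketched in Python as follows. The sketch is
illustrative only: it takes as input a word whose $p$-bit fields already hold
the mismatch bits in their low $p - 1$ bits (i.e., the result of step 2), and
the function name `at_most_k` and its parameters are assumptions introduced
here.

```python
def at_most_k(a, p, copies, k):
    """a: word of `copies` fields of p bits; the low p-1 bits of each field
    hold mismatch bits and the top bit is the sentinel position.
    Returns a word with the sentinel bit set for every field that has at
    most k mismatch bits."""
    vp = sum(1 << (i * p + p - 1) for i in range(copies))  # V_p
    ones = vp >> (p - 1)          # a 1 in the lowest bit of each field
    for _ in range(k + 1):
        a |= vp                   # (re)set every sentinel
        a &= a - ones             # clear the lowest set bit of each field
    # A sentinel survives only if more than k mismatch bits were present,
    # so the final xor inverts the sentinels.
    return (a & vp) ^ vp
```

Because the sentinel is set before every subtraction, each field is non-zero
and subtracting 1 from it never borrows from the neighbouring field.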



4 Applications 

The presented technique can be used for several other string matching problems. 
We show how to adapt it for particular models in the following subsections. 

4.1 Matching with k-mismatches and wildcards

Assume that the alphabet $\Sigma$, of size $\sigma$, contains a wildcard
symbol, i.e., a special symbol that matches any other symbol of the alphabet.
Hence we count the mismatches only for regular characters [9]. We consider
the case in which wildcards may occur in both the pattern and the text.

In the preprocessing we create, in $O(\log(w/(m\log\sigma)))$ time using
only $AC^0$ instructions, or even in $O(1)$ time in the word-RAM model, two
$(\log\sigma)$-words $W_P$ and $H_P$. The word $W_P$ contains $\ell\hat{m}$
fields. A field in $W_P$ has the top bit set to 0 if the field corresponds to
a wildcard position in the word $A$, and to 1 otherwise. All the other bits
are set to 0. The word $H_P$ contains $\ell\hat{m}$ copies of the wildcard
symbol.

At each iteration $i$ of the searching phase, we compute the word
$W_T$ = fnf($B_i$ ^ $H_P$, $\log\sigma$). Analogously to $W_P$, a field in
$W_T$ has the top bit set to 0 if the field corresponds to a wildcard
position in $B_i$, and to 1 otherwise. All the other bits are set to 0.

Then, we and the result of operation 2 of the algorithm with $W_P$ & $W_T$
(i.e., A' <- A' & (W_P & W_T)), which effectively means that there can be no
mismatch in a position where either a pattern or a text wildcard occurs. The
rest of the procedure is unchanged. The overall time complexity is also
unchanged, apart from the initial (negligible) $O(\log(w/(m\log\sigma)))$
term.

6 http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetKernighan



4.2 δ-matching with k-mismatches and (δ, γ)-matching

In $\delta$-matching [10] any two characters $t$ and $p$ are defined to match
iff $|t - p| \le \delta$. Combined with Hamming distance, a text position $j$
matches the pattern if
$|\{0 \le i < m : |T[j+i] - P[i]| > \delta_i\}| \le k$. Note that we can
allow $\delta_i$ to be different for each pattern position $i$, while usually
in $\delta$-matching the allowed error $\delta$ is the same for each
character. This also yields an alternate solution to matching with wildcards
in the pattern, by simply using $\delta_i = \sigma - 1$ for pattern positions
$i$ corresponding to wildcards, and $\delta_i = 0$ elsewhere.

In the preprocessing phase we compute $D'[j] = \delta_j$ and construct its
packed representation $D$, a $(\log\sigma)$-word holding $\ell$ copies of
$D'$. Let $W$ be a $(\log\sigma)$-word with top bits set in fields
corresponding to pattern characters. Then at iteration $i$ of the searching
phase compute

1. X <- pvmax(A, B_i, log σ) - pvmin(A, B_i, log σ)

2. A' <- pmin(X, D, log σ) ^ W

The result is that in every field of $A'$ the top bit is 1 iff the
corresponding pattern and text characters do not $\delta$-match. The rest of
the algorithm is as before; only steps 1-2 of either of the main algorithms
(for the $AC^0$ and word-RAM models) are replaced with the two steps above.
The time complexities remain the same.

If we are interested in the (more conventional) exact $\delta$-matching
variant (i.e., assume that $k = 0$), we can improve the time to
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor})$ using the following algorithm:

1. X <- pvmax(A, B_i, log σ) - pvmin(A, B_i, log σ)

2. A' <- pmin(X, D, log σ) ^ W

3. M <- fnf(A', m̂ log σ) ^ V_{m̂ log σ}

4. report(M)

The fnf operation interprets the word $A'$ as an $(\hat{m}\log\sigma)$-word,
and sets the top bit to 1 for each pattern copy where even one character did
not $\delta$-match. The xor operation then inverts those top bits, giving
bit 1 for pattern copies that did $\delta$-match in all character positions.

We note that a close relative of $\delta$-matching is less-than matching,
where characters $p$ and $t$ match if $p \le t$. This model has applications
in other pattern matching problems, see e.g. [1]. The less-than matching
problem can be solved with our methods in the same way as $\delta$-matching,
that is, the first two lines are simply replaced with
A' <- pmin(A, B_i, log σ) ^ W. It is also possible to solve
$\delta$-matching by combining less-than and greater-than matching.

In $\gamma$-matching we sum the absolute differences between character
pairs, and limit this sum with a parameter $\gamma$, i.e., a text position
$j$ $\gamma$-matches the pattern if
$\sum_{0 \le i < m} |T[j+i] - P[i]| \le \gamma$. If both the $\delta$ and
$\gamma$ conditions must hold, we speak of $(\delta,\gamma)$-matching. There
are many algorithms devoted to this model, see e.g. [11]. Thus in order to
solve $\gamma$-matching, we need to sum the fields of $X$ and compare against
$\gamma$ (note that we need to check also the $\delta$ condition). If we do
not use field compaction and deferred reporting of occurrences, this is
easiest to do in the same way as the first widening phase of the bsa
operation, i.e., by simply shifting and adding the fields in parallel in
$O(\log m)$ time, giving
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}\log m)$ total time in $AC^0$.
This result can be improved in both the $AC^0$ and word-RAM models. We first
define one more operation:

interleave two words (interleave($A$, $B$, $S$, $f$)): given two $(f)$-words
$A$ and $B$, interleave the fields according to the $(f)$-word $S$. That is,
return a word $Z$ where a field is selected from $A$ if the corresponding
field in $S$ is zero, and from $B$ otherwise. This can be implemented in
$O(1)$ time as follows:

1. S <- fnf(S, f)

2. S <- (S - (S >> (f - 1))) | S

3. Z <- (A & ~S) | (B & S)
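
The interleave steps can be sketched in Python as follows. As in the other
sketches, Python integers stand in for $w$-bit words and the explicit word
size `w` is an assumption introduced for this illustration.

```python
def interleave(a, b, s, f, w):
    """Per f-bit field: take the field of a where the field of s is zero,
    and the field of b otherwise."""
    v = sum(1 << (i * f + f - 1) for i in range(w // f))   # V_f
    # Step 1: fnf(s, f), i.e. the top bit of every non-zero field of s.
    s = (~(v - (s & ~v)) | s) & v
    # Step 2: spread each set top bit over its whole field.
    s = (s - (s >> (f - 1))) | s
    # Step 3: select fields from a or b according to the mask.
    return (a & ~s) | (b & s)
```

In step 2, subtracting the shifted copy turns each set top bit $2^{f-1}$ into
$2^{f-1} - 1$ (all low bits set), and the final or restores the top bit, so
every selected field becomes an all-ones mask.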

Consider first the $AC^0$ model. We use an approach similar to the one used
in the $k$-mismatches case; the only difference is that we need more bits to
represent the accumulated sums. That is, we replace $\hat{k}$ with
$\hat{\gamma} = 2^{\lceil \log(\gamma+1) \rceil}$ (the smallest power of two
greater than $\gamma$), and $K$ with a word $G$ containing
$\ell \lfloor \log\sigma/f \rfloor$ copies of the integer $\gamma$, where
$f = \min(\log\hat{\gamma} + 1, \log\sigma)$. The ibsa operation then works
correctly, i.e., accumulates the absolute differences without overflowing the
sums, provided that all characters $\delta$-match, as otherwise the
corresponding absolute difference may be as large as $\sigma - 1$, which in
turn can be larger than $\gamma$. To this end, we take care that we do not
compute the exact sum if even one character pair does not $\delta$-match.
Instead, in such cases we assume that all the differences are $\delta$, to
ensure that ibsa works correctly and that the text position will not be
reported. This works because in $\gamma$-matching it always holds that
$m\delta > \gamma$, as otherwise the $\gamma$ condition does not prune
anything. The complete pseudocode follows.

1. X <- pvmax(A, B_i, log σ) - pvmin(A, B_i, log σ)

2. A' <- pmin(X, D, log σ) ^ W

3. Z <- fnf(A', m̂ log σ) ^ V_{m̂ log σ}

4. X' <- interleave(X, D, Z, m̂ log σ)

5. H <- (H << f) | X'

6. if i > 0 and i mod ⌊log σ / f⌋ = 0

7.     M <- pmin(ibsa(H, f, m̂), G, f)

8.     report(M)

9.     H <- 0

The total time is
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}(1 + \log m\log\gamma/\log\sigma))$.
This becomes $O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor})$ for
$\log m\log\gamma = O(\log\sigma)$.

In the word-RAM model the algorithm is again simpler. We start as in the
$AC^0$ case to compute the word $X'$, and then use bsa and pmin to accumulate
the absolute differences and compare against the threshold value $\gamma$ in
parallel. The only thing remaining is to adjust the parameters so as not to
cause overflows in bsa. To this end, we use
$r = \lceil \log(w\delta + 1) \rceil/\log\sigma$ and
$f = \min(r, \hat{m})\log\sigma$, and the algorithm is

1. X <- pvmax(A, B_i, log σ) - pvmin(A, B_i, log σ)

2. A' <- pmin(X, D, log σ) ^ W

3. Z <- fnf(A', m̂ log σ) ^ V_{m̂ log σ}

4. X' <- interleave(X, D, Z, m̂ log σ)

5. M <- pmin(bsa(X', log σ, m̂), G, f)

6. report(M)

where the word $G$ contains $\ell$ copies of the integer $\gamma$ this time,
spaced by $\hat{m}\log\sigma - f$ bits. The time complexity is
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}\log\min(m, \log(w\delta)/\log\sigma))$,
which again obtains $O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor})$ time for
$\log\sigma = \Omega(\log(w\delta))$ or constant $m$.

We can also combine the two models, $\delta$-matching with $k$-mismatches
and $(\delta,\gamma)$-matching, to obtain $(\delta, k, \gamma)$-matching. In
this model we limit the number of characters not $\delta$-matching by $k$,
and the accumulated sum of the absolute differences by $\gamma$. The basic
idea is to compute two match vectors, $M_\delta$ and $M_\gamma$, take their
bitwise and as M <- M_δ & M_γ, and then report the occurrences with respect
to $M$. The vector $M_\delta$ can be computed as was already shown. To
compute $M_\gamma$ we just skip the interleave operation to take the absolute
differences raw, without saturating them with $\delta$. In $AC^0$ we can use
the basic algorithm to compute $M_\gamma$ in
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}\log m)$ time, which dominates
the total time. However, since the sums need more bits now, there is a
significant overhead in the case of the word-RAM algorithm and of the
improved $AC^0$ algorithm. In particular, in the case of the word-RAM
algorithm, the time complexity becomes
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}\log\min(m, \log(w\sigma)/\log\sigma))$,
i.e., slightly worse. Instead, in the case of the improved $AC^0$ algorithm,
the overhead makes the algorithm useless. However, we can still manage to
obtain
$O(\frac{n}{\lfloor w/(m\log\sigma) \rfloor}(1 + \log m\log\gamma/\log\sigma))$
time by modifying the model so that we accumulate the absolute differences
only on $\delta$-matching character positions. This can be easily done with
the tools already presented, namely using the interleave operation to set
non-$\delta$-matching character positions to 0 in word $X'$.

5 Conclusion 

We presented a novel technique for approximate pattern matching with
$k$-mismatches when the text is given in packed form. Assuming the pattern is
short enough, it is possible to achieve a sublinear search time, if several
pattern copies are matched against different (but adjacent) text substrings
at the same time. We described variants of our simple method in the $AC^0$
and word-RAM models and also considered the case when the number $k$ of
allowed errors is small. Moreover, we showed how to adapt our algorithms to
other matching models, including approximate matching with wildcard
(don't-care) symbols, $\delta$-matching with $k$-mismatches and
$(\delta,\gamma)$-matching.